⚡ LLM Optimization - jimman · Scour

SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration

arxiv.org·2d

⚡Model Efficiency

Quantization-Aware Distillation

ternarysearch.blogspot.com·15h·

Discuss: Hacker News

⚡Model Efficiency

Optimized LLM Inference Engines

rishirajacharya.com·4d

⚡Model Efficiency

Zero-Latency Local AI: Tuning Your Linux Kernel for LLM Inference 🐧🧠

dev.to·1d·

Discuss: DEV

⚡Model Efficiency

Mechanistic Interpretability: Peeking Inside an LLM

towardsdatascience.com·3d

🔍AI Interpretability

Is Your Machine Learning Pipeline as Efficient as it Could Be?

kdnuggets.com·2d

⚡Model Efficiency

How I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

mohammedeabdelaziz.github.io·1d·

Discuss: Hacker News

⚡Model Efficiency

Determining Energy Efficiency Sweet Spots in Production LLM Inference

arxiv.org·2d

⚡Model Efficiency

ML-LIB: Machine Learning Library Proposed For The Linux Kernel

phoronix.com·1d·

Discuss: Hacker News

⚡Model Efficiency

Understanding LLM Inference Engines: Inside Nano-vLLM (Part 2)

neutree.ai·2d·

Discuss: Hacker News

⚡Model Efficiency

From Prediction to Compilation: A Manifesto for Intrinsically Reliable AI

news.ycombinator.com·5h·

Discuss: Hacker News

✍️Prompt Engineering

Finding the needle in the logstack: Reducing LLM context with TF-IDF

eliseomartelli.it·2d

⚡Model Efficiency

Why Files Are Not Enough as Memory for AI Agents

medium.com·6h·

Discuss: Hacker News

✍️Prompt Engineering

chatprd.ai·55m

🔍AI Interpretability

Fastfood: Approximate Kernel Expansions in Loglinear Time

dev.to·16h·

Discuss: DEV

⚡Model Efficiency

Show HN: Model Training Memory Simulator

czheo.github.io·8h·

Discuss: Hacker News

⚡Model Efficiency

NotebookLM: The AI that only learns from you

byandrev.dev·22h·

Discuss: Hacker News

🔍AI Interpretability

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

github.com·1d·

Discuss: Hacker News, r/LocalLLaMA

⚡Model Efficiency

[RFC PATCH v1 0/4] Machine Learning (ML) library in Linux kernel

lore.kernel.org·1d·

Discuss: Lobsters, Hacker News

Fast Autoscheduling for Sparse ML Frameworks

ajroot.pl·3d·

Discuss: Hacker News, r/Compilers

⚡Model Efficiency
